A survey of itemset mining

Authors

  • Philippe Fournier-Viger
  • Chun-Wei Lin
  • Bay Vo
  • Tin C. Truong
  • Ji Zhang
  • Hoai Bac Le
Abstract

Itemset mining is an important subfield of data mining, which consists of discovering interesting and useful patterns in transaction databases. The traditional task of frequent itemset mining is to discover groups of items (itemsets) that appear frequently together in transactions made by customers. Although itemset mining was designed for market basket analysis, it can be viewed more generally as the task of discovering groups of attribute values frequently co-occurring in databases. Due to its numerous applications in domains such as bioinformatics, text mining, product recommendation, e-learning, and web click stream analysis, itemset mining has become a popular research area. This paper provides an up-to-date survey that can serve both as an introduction and as a guide to recent advances and opportunities in the field. The problem of frequent itemset mining and its applications are described. Moreover, the main approaches and strategies for solving itemset mining problems are presented, along with their characteristics. Limitations of traditional frequent itemset mining approaches are also highlighted, and extensions of the task are presented, such as high-utility itemset mining, rare itemset mining, fuzzy itemset mining and uncertain itemset mining. The paper also discusses research opportunities and the relationship to other popular pattern mining problems such as sequential pattern mining, episode mining, sub-graph mining and association rule mining.

Affiliations

  • ∗ School of Natural Science and Humanities, Harbin Institute of Technology Shenzhen Graduate School, China
  • † School of Computer Science and Technology, University 2, Harbin Institute of Technology Shenzhen Graduate School, China
  • ‡ Faculty of Information Technology, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
  • § College of Electronics and Information Engineering, Sejong University, Seoul, Republic of Korea
  • ¶ Department of Mathematics and Informatics, University of Dalat, Vietnam
  • ‖ Faculty of Health, Engineering and Sciences, University of Southern Queensland, Australia
  • ∗∗ Faculty of Information Technology, University of Science, Vietnam
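To make the frequent itemset mining task described in the abstract concrete, the following is a minimal sketch of a level-wise search in the style of Apriori. It is an illustration under stated assumptions, not the survey's own code: the function name `frequent_itemsets`, the parameter `minsup` (an absolute support count) and the small grocery database are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for itemsets found in at least `minsup` transactions.

    Hypothetical helper for illustration; `minsup` is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= minsup}
    result = dict(current)

    k = 2
    while current:
        # Build size-k candidates by joining frequent (k-1)-itemsets, and keep
        # only those whose (k-1)-subsets are all frequent (downward closure).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count the support of each candidate with one pass over the database.
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= minsup:
                current[cand] = support
        result.update(current)
        k += 1
    return result


if __name__ == "__main__":
    # Assumed toy database: each set is one customer transaction.
    db = [
        {"bread", "milk"},
        {"bread", "butter", "milk"},
        {"butter", "milk"},
        {"bread", "butter"},
    ]
    found = frequent_itemsets(db, minsup=2)
    for itemset in sorted(found, key=lambda s: (len(s), sorted(s))):
        print(sorted(itemset), found[itemset])
```

The pruning step relies on the fact that every subset of a frequent itemset must itself be frequent, which is what keeps the candidate space manageable in this family of algorithms.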

Similar resources

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...

A Survey on Moving Towards Frequent Pattern Growth for Infrequent Weighted Itemset Mining

Data mining and knowledge discovery is an important research area. This paper presents a survey of various methods for frequent pattern mining. Over the past decade, frequent pattern mining has played a very important role, but it does not consider the weight or value of the items. The first and most basic technique for finding correlations in data is Association Rule Mining. In ARM...

Mining High Utility Itemsets – A Recent Survey

Association rule mining (ARM) plays a vital role in data mining. It aims at searching for interesting patterns among items in a dense data set or database and discovers association rules among a large number of itemsets. The importance of ARM is increasing with the demand for finding frequent patterns in large data sources. Researchers have developed many algorithms and techniques for generati...

A Survey on Infrequent Weighted Itemset Mining Approaches

Association Rule Mining (ARM) is one of the most popular data mining techniques. Most existing work is based on frequent itemsets. Frequent itemsets find application in a number of real-life contexts, e.g., market basket analysis, medical image processing, and biological data analysis. In recent years, the attention of researchers has been focused on infrequent itemset mining. This paper tackles the issue...

A Survey on High Utility Itemset Mining Using Transaction Databases

Data mining can be described as an activity that analyzes data and extracts new, nontrivial information from large databases. Traditional data mining methods have focused on finding statistical correlations between items that frequently appear in the database. High utility itemset mining is an area of research where utility based mining is a descriptive type ...

Journal:
  • Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

Volume 7

Publication date: 2017